Assignment schemes
Experiments are investigations in which an intervention, in all its essential elements, is under the control of the investigator. (Cox & Reid)
Two major types of control:
In general you might want to set things up so that your randomization is replicable. You can do this by setting a seed:
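A minimal base-R sketch (the seed value here is arbitrary):

```r
# Fixing the seed makes the draw reproducible: rerunning this script
# yields the same assignment every time
set.seed(343)
Z <- sample(rep(c(TRUE, FALSE), 5))  # assign 5 of 10 units to treatment
Z
```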
Even better: set things up so that you can reproduce many possible draws, which lets you check the propensities for each unit.
[1] 0.495 0.493 0.512 0.522 0.505 0.522 0.479 0.510 0.444 0.518
Here the \(P\) matrix gives 1000 possible ways of allocating 5 of 10 units to treatment. We can then confirm that the average propensity is 0.5.
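A base-R sketch of such a matrix (the seed is arbitrary; rows are units, columns are candidate assignments, and row means give unit-level propensities):

```r
# 10 units (rows) x 1000 candidate assignments (columns);
# each column assigns exactly 5 of 10 units to treatment
set.seed(343)
P <- replicate(1000, sample(rep(c(1, 0), 5)))
round(rowMeans(P), 3)  # each unit's propensity, all close to 0.5
```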
A survey dictionary with results from a complex randomization presented in a simple way for enumerators
People often wonder: did randomization work? Common practice is to implement a set of \(t\)-tests to see if there is balance. This makes no sense.
If you doubt whether the randomization was implemented properly, run an \(F\) test. If you worry about variance, specify controls in advance as a function of their relation with outcomes (more on this later). If you worry about conditional bias, look at substantive differences between groups, not \(t\)-tests.
If you want realizations to have particular properties: build it into the scheme in advance.
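As a sketch of the omnibus implementation check mentioned above: regress assignment on the pre-treatment covariates and read off the joint \(F\) statistic (the covariates here are invented for illustration):

```r
# Balance check via a single joint F test rather than one t-test per covariate
set.seed(1)
N  <- 100
X1 <- rnorm(N)                 # hypothetical pre-treatment covariates
X2 <- rnorm(N)
Z  <- sample(rep(0:1, N / 2))  # complete random assignment
summary(lm(Z ~ X1 + X2))$fstatistic  # joint test of whether covariates predict Z
```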
Note: clusters are part of your design, not part of the world.
Often used if intervention has to function at the cluster level or if outcome defined at the cluster level.
Disadvantage: loss of statistical power
However: perfectly possible to assign some treatments at cluster level and then other treatments at the individual level
Principle: (unless you are worried about spillovers) generally make clusters as small as possible
Principle: Surprisingly, variability in cluster size makes analysis harder.
Be clear about whether you believe effects are operating at the cluster level or at the individual level. This matters for power calculations.
Be clear about whether spillover effects operate only within clusters or also across them. If within only you might be able to interpret treatment as the effect of being in a treated cluster…
Surprisingly, if clusters are of different sizes the difference in means estimator is not unbiased, even if all units are assigned to treatment with the same probability.
Here’s the intuition. Say there are two clusters, each with homogeneous treatment effects:
| Cluster | Size | Y0 | Y1 |
|---|---|---|---|
| 1 | 1000000 | 0 | 1 |
| 2 | 1 | 0 | 0 |
Then: What is the true average treatment effect? What do you expect to estimate from cluster random assignment?
The solution is to block by cluster size. For more see: http://gking.harvard.edu/files/cluster.pdf
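The expectation can be computed directly for the two-cluster example, using the numbers from the table above:

```r
# Two clusters, homogeneous within cluster (values from the table)
sizes <- c(1e6, 1)
Y0 <- c(0, 0)
Y1 <- c(1, 0)

# True (individual-level) average treatment effect: essentially 1
true_ate <- sum(sizes * (Y1 - Y0)) / sum(sizes)

# Cluster random assignment treats one cluster; the difference in means
# over individuals gives 1 or 0 depending on which cluster is drawn
est_1 <- Y1[1] - Y0[2]  # cluster 1 treated: 1 - 0 = 1
est_2 <- Y1[2] - Y0[1]  # cluster 2 treated: 0 - 0 = 0
expected_estimate <- mean(c(est_1, est_2))  # 0.5, far from the true ATE
```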
There are more or less efficient ways to randomize.
Consider a case with four units and two strata. There are 6 possible assignments of 2 units to treatment:
| ID | X | Y(0) | Y(1) | R1 | R2 | R3 | R4 | R5 | R6 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 2 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |
| 3 | 2 | 1 | 2 | 0 | 1 | 0 | 1 | 0 | 1 |
| 4 | 2 | 1 | 2 | 0 | 0 | 1 | 0 | 1 | 1 |
| – | – | – | – | – | – | – | – | – | – |
| \(\widehat{\tau}\): | – | – | – | 0 | 1 | 1 | 1 | 1 | 2 |
Even with a constant treatment effect and everything uniform within blocks, there is variance in the estimation of \(\widehat{\tau}\). This can be eliminated by excluding R1 and R6.
Simple blocking in R (5 pairs):
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| TRUE | TRUE | TRUE | FALSE | FALSE |
| FALSE | FALSE | FALSE | TRUE | TRUE |
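An assignment like the one above can be reconstructed with a base-R sketch (the seed is arbitrary):

```r
# 10 units in 5 pairs; within each pair exactly one unit is treated
set.seed(343)
Z <- unlist(lapply(1:5, function(pair) sample(c(TRUE, FALSE))))
matrix(Z, nrow = 2)  # columns are pairs: one TRUE and one FALSE in each
```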
DeclareDesign can help with this.

| | \(T2=0\) | \(T2=1\) |
|---|---|---|
| T1 = 0 | \(50\%\) | \(0\%\) |
| T1 = 1 | \(50\%\) | \(0\%\) |
| | \(T2=0\) | \(T2=1\) |
|---|---|---|
| T1 = 0 | \(25\%\) | \(25\%\) |
| T1 = 1 | \(25\%\) | \(25\%\) |
| | \(T2=0\) | \(T2=1\) |
|---|---|---|
| T1 = 0 | \(33.3\%\) | \(33.3\%\) |
| T1 = 1 | \(33.3\%\) | \(0\%\) |
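The full factorial scheme (25% per cell) can be sketched in base R by assigning the two treatments independently:

```r
# Independent 50/50 assignment of T1 and T2 yields the 25%-per-cell design
set.seed(343)
N  <- 100
T1 <- sample(rep(0:1, N / 2))  # complete random assignment of T1
T2 <- sample(rep(0:1, N / 2))  # independent assignment of T2
table(T1, T2) / N              # close to 25% in each of the four cells
```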
In practice if you have a lot of treatments it can be hard to do full factorial designs – there may be too many combinations.
In such cases people use fractional factorial designs, like the one below (5 treatments but only 8 units!)
| Variation | T1 | T2 | T3 | T4 | T5 |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 1 | 1 |
| 2 | 0 | 0 | 1 | 0 | 0 |
| 3 | 0 | 1 | 0 | 0 | 1 |
| 4 | 0 | 1 | 1 | 1 | 0 |
| 5 | 1 | 0 | 0 | 1 | 0 |
| 6 | 1 | 0 | 1 | 0 | 1 |
| 7 | 1 | 1 | 0 | 0 | 0 |
| 8 | 1 | 1 | 1 | 1 | 1 |
Then randomly assign units to rows. Note columns might also be blocking covariates.
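One can check in base R that the scheme in the table is balanced and that treatment columns are pairwise orthogonal:

```r
# The 8 x 5 fractional factorial from the table above
D <- matrix(c(0, 0, 0, 1, 1,
              0, 0, 1, 0, 0,
              0, 1, 0, 0, 1,
              0, 1, 1, 1, 0,
              1, 0, 0, 1, 0,
              1, 0, 1, 0, 1,
              1, 1, 0, 0, 0,
              1, 1, 1, 1, 1), byrow = TRUE, ncol = 5)
colSums(D)     # each treatment is given to exactly 4 of the 8 rows
crossprod(D)   # off-diagonals all 2: each pair of treatments co-occurs twice
```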
In R, look at library(survey)
Muralidharan, Romero, and Wüthrich (2023) write:
Factorial designs are widely used to study multiple treatments in one experiment. While t-tests using a fully-saturated “long” model provide valid inferences, “short” model t-tests (that ignore interactions) yield higher power if interactions are zero, but incorrect inferences otherwise. Of 27 factorial experiments published in top-5 journals (2007–2017), 19 use the short model. After including interactions, over half of their results lose significance. […]
Anything to be done on randomization to address external validity concerns?
DeclareDesign

A design with hierarchical data and different assignment schemes.
design <-
declare_model(
school = add_level(N = 16,
u_school = rnorm(N, mean = 0)),
classroom = add_level(N = 4,
u_classroom = rnorm(N, mean = 0)),
student = add_level(N = 20,
u_student = rnorm(N, mean = 0))
) +
declare_model(
potential_outcomes(Y ~ .1*Z + u_classroom + u_student + u_school)
) +
declare_assignment(Z = simple_ra(N)) +
declare_measurement(Y = reveal_outcomes(Y ~ Z)) +
declare_inquiry(ATE = mean(Y_Z_1 - Y_Z_0)) +
declare_estimator(Y ~ Z, .method = difference_in_means)

Here are the first couple of rows and columns of the resulting data frame.
| school | u_school | classroom | u_classroom | student | u_student | Y_Z_0 | Y_Z_1 | Z | Y |
|---|---|---|---|---|---|---|---|---|---|
| 01 | -0.77 | 01 | -0.06 | 0001 | 0.36 | -0.48 | -0.38 | 0 | -0.48 |
| 01 | -0.77 | 01 | -0.06 | 0002 | 0.16 | -0.67 | -0.57 | 0 | -0.67 |
| 01 | -0.77 | 01 | -0.06 | 0003 | 1.04 | 0.21 | 0.31 | 1 | 0.31 |
| 01 | -0.77 | 01 | -0.06 | 0004 | 1.54 | 0.70 | 0.80 | 0 | 0.70 |
| 01 | -0.77 | 01 | -0.06 | 0005 | -0.99 | -1.82 | -1.72 | 0 | -1.82 |
| 01 | -0.77 | 01 | -0.06 | 0006 | -0.70 | -1.53 | -1.43 | 0 | -1.53 |
Here is the distribution between treatment and control:
We can draw a new set of data and look at the number of subjects in the treatment and control groups.
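With simple (coin-flip) assignment the realized group sizes vary from draw to draw; a base-R sketch of just that feature, using the design's dimensions:

```r
# 16 schools x 4 classrooms x 20 students = 1280 subjects;
# under simple random assignment each subject is an independent coin flip
set.seed(343)
n_treated <- replicate(3, sum(rbinom(16 * 4 * 20, 1, 0.5)))
n_treated  # the number treated differs across draws
```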
But what if all students in a given class have to be assigned the same treatment?
assignment_clustered <-
declare_assignment(Z = cluster_ra(clusters = classroom))
estimator_clustered <-
declare_estimator(Y ~ Z, clusters = classroom,
.method = difference_in_means)
design_clustered <-
design |>
replace_step("assignment", assignment_clustered) |>
replace_step("estimator", estimator_clustered)

assignment_clustered_blocked <-
declare_assignment(Z = block_and_cluster_ra(blocks = school,
clusters = classroom))
estimator_clustered_blocked <-
declare_estimator(Y ~ Z, blocks = school, clusters = classroom,
.method = difference_in_means)
design_clustered_blocked <-
design |>
replace_step("assignment", assignment_clustered_blocked) |>
replace_step("estimator", estimator_clustered_blocked)

| Design | Power | Coverage |
|---|---|---|
| simple | 0.16 | 0.95 |
| (0.01) | (0.01) | |
| complete | 0.20 | 0.96 |
| (0.01) | (0.01) | |
| blocked | 0.42 | 0.95 |
| (0.01) | (0.01) | |
| clustered | 0.06 | 0.96 |
| (0.01) | (0.01) | |
| clustered_blocked | 0.08 | 0.96 |
| (0.01) | (0.01) |
In many designs you seek to assign an integer number of subjects to treatment from some set.
Sometimes however your assignment targets are not integers.
Example:
Two strategies:
Can also be used to set targets
# remotes::install_github("macartan/probra")
library(probra)
set.seed(1)
fabricate(N = 4, size = c(47, 53, 87, 25), n_treated = prob_ra(.5*size)) %>%
janitor::adorn_totals("row") |>
kable(caption = "Setting targets to get 50% targets with minimal variance")

| ID | size | n_treated |
|---|---|---|
| 1 | 47 | 23 |
| 2 | 53 | 27 |
| 3 | 87 | 43 |
| 4 | 25 | 13 |
| Total | 212 | 106 |
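The idea behind such probabilistic rounding can be sketched in base R (independent rounding shown here; `prob_ra` additionally keeps the variance of the total minimal):

```r
# Round each non-integer target up or down with probability equal to its
# fractional part, so each unit's expected count hits the target exactly
set.seed(343)
targets   <- 0.5 * c(47, 53, 87, 25)   # 23.5 26.5 43.5 12.5
frac      <- targets - floor(targets)
n_treated <- floor(targets) + (runif(length(targets)) < frac)
n_treated        # each within 1 of its target
sum(n_treated)   # near sum(targets) = 106
```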
Indirect control
Indirect assignments are generally generated by applying a direct assignment and then figuring out the implied indirect assignments.
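A base-R sketch of this idea, for a made-up example with units on a line, where a control unit counts as indirectly assigned if a neighbor is directly treated:

```r
# Direct assignment: 5 of 10 units on a line; indirect exposure:
# an untreated unit with at least one directly treated neighbor.
# Simulating many direct draws gives the indirect propensities.
set.seed(343)
indirect_propensities <- rowMeans(replicate(1000, {
  Z <- sample(rep(c(1, 0), 5))
  neighbor_treated <- (c(0, Z[-10]) + c(Z[-1], 0)) > 0
  Z == 0 & neighbor_treated
}))
round(indirect_propensities, 2)  # lower at the ends of the line
```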
Looks better: but there are trade-offs between the direct and indirect distributions.
Figuring out the optimal procedure requires full diagnosis.